
feat: add parquet compression option #4287


Open
wants to merge 2 commits into
base: main
from


liaol (Contributor) commented Jul 9, 2025

Parquet compression can save 50% of disk or S3 usage.

CPU overhead (zstd at the fastest level, in a production environment):

  • write path (ingester flush, compression): < 1%
  • read path (querier?, decompression): 10%
  • compactor: decompression 15%, compression < 1%

Usage:
Add the compression_algo and compression_level options to the YAML config, or use the -pyroscopedb.compression-level CLI flag.
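As an illustration of how the two options might be checked before use, here is a minimal Go sketch. The `compressionOptions` type, its `validate` method, and the accepted values (taken from the rows of the benchmark table below) are all assumptions made for this example; they are not the PR's actual code or its accepted values.

```go
package main

import (
	"fmt"
	"strings"
)

// compressionOptions mirrors the two settings named in the PR description.
// The type and its field names are hypothetical.
type compressionOptions struct {
	Algo  string // value of compression_algo
	Level string // value of compression_level
}

// validate normalizes and checks the options. The accepted values are
// assumptions derived from the benchmark table, not from the PR code.
func (o compressionOptions) validate() (compressionOptions, error) {
	algo := strings.ToLower(strings.TrimSpace(o.Algo))
	level := strings.ToLower(strings.TrimSpace(o.Level))

	switch algo {
	case "", "none":
		// No algorithm configured: treat as uncompressed.
		return compressionOptions{Algo: "none"}, nil
	case "snappy":
		// Snappy is listed without a level in the table, so the level is ignored.
		return compressionOptions{Algo: algo}, nil
	case "lz4", "gzip", "zstd":
		// These algorithms take a level; it is checked below.
	default:
		return compressionOptions{}, fmt.Errorf("unknown compression algorithm %q", o.Algo)
	}

	if level == "" {
		level = "default"
	}
	switch level {
	case "fastest", "default", "best":
		return compressionOptions{Algo: algo, Level: level}, nil
	}
	return compressionOptions{}, fmt.Errorf("unknown compression level %q", o.Level)
}

func main() {
	opts, err := compressionOptions{Algo: "ZSTD", Level: "fastest"}.validate()
	if err != nil {
		fmt.Println("invalid options:", err)
		return
	}
	fmt.Printf("algo=%s level=%s\n", opts.Algo, opts.Level)
}
```

Rejecting unknown values at startup, rather than silently falling back to no compression, is one possible design choice; the PR may handle this differently.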

Test results for profile.Location (see pkg/phlaredb/compression_test.go). ZSTD Fastest comes out best: it produces the smallest output, with a compression time comparable to no compression.

Algorithm       | Size (KB) | Ratio (%) | Compress (Write) Time | Decompress (Read) Time
----------------+-----------+-----------+-----------------------+-----------------------
No Compression  | 1375.1    | 100.02    | 6.639104ms            | 20.057387ms
Snappy          | 341.7     | 24.86     | 8.657124ms            | 20.510525ms
LZ4 Fastest     | 340.7     | 24.78     | 6.876629ms            | 19.903987ms
LZ4 Default     | 301.1     | 21.90     | 104.934595ms          | 13.72565ms
LZ4 Best        | 301.1     | 21.90     | 98.728766ms           | 12.769037ms
GZIP Fastest    | 239.0     | 17.38     | 6.377262ms            | 15.00412ms
GZIP Default    | 225.2     | 16.38     | 7.914216ms            | 14.851654ms
GZIP Best       | 244.2     | 17.76     | 2.103112012s          | 14.874458ms
ZSTD Fastest    | 189.9     | 13.81     | 6.507433ms            | 16.559941ms
ZSTD Default    | 225.5     | 16.41     | 6.7972ms              | 13.503095ms
ZSTD Best       | 200.7     | 14.60     | 20.274187ms           | 13.017883ms
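
For readers who want to reproduce this kind of size and timing comparison, the following Go sketch shows one way to measure it. It is not the test in pkg/phlaredb/compression_test.go: it uses only the standard library's `compress/gzip` on a synthetic, highly repetitive byte slice instead of Parquet codecs on real profile.Location rows, so its numbers will not match the table. The `result` type and `measure` function are names invented for this example.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"strings"
	"time"
)

// result holds one row of a size/time comparison.
type result struct {
	Name           string
	Size           int     // compressed size in bytes
	Ratio          float64 // compressed size as a percentage of the input size
	CompressTime   time.Duration
	DecompressTime time.Duration
}

// measure compresses data with gzip at the given level, decompresses it
// again, verifies the round trip, and reports size and wall-clock timings.
func measure(name string, level int, data []byte) (result, error) {
	var buf bytes.Buffer

	start := time.Now()
	w, err := gzip.NewWriterLevel(&buf, level)
	if err != nil {
		return result{}, err
	}
	if _, err := w.Write(data); err != nil {
		return result{}, err
	}
	if err := w.Close(); err != nil {
		return result{}, err
	}
	compressTime := time.Since(start)
	size := buf.Len() // record before the reader drains the buffer

	start = time.Now()
	r, err := gzip.NewReader(&buf)
	if err != nil {
		return result{}, err
	}
	out, err := io.ReadAll(r)
	if err != nil {
		return result{}, err
	}
	decompressTime := time.Since(start)

	if !bytes.Equal(out, data) {
		return result{}, fmt.Errorf("%s: round trip mismatch", name)
	}
	return result{
		Name:           name,
		Size:           size,
		Ratio:          100 * float64(size) / float64(len(data)),
		CompressTime:   compressTime,
		DecompressTime: decompressTime,
	}, nil
}

func main() {
	// Synthetic input; real profile data will compress differently.
	data := []byte(strings.Repeat("pkg/example/file.go:ExampleFunc:42\n", 20000))

	levels := []struct {
		name  string
		level int
	}{
		{"GZIP Fastest", gzip.BestSpeed},
		{"GZIP Default", gzip.DefaultCompression},
		{"GZIP Best", gzip.BestCompression},
	}
	for _, l := range levels {
		res, err := measure(l.name, l.level, data)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%-13s | %8.1f KB | %6.2f%% | %v | %v\n",
			res.Name, float64(res.Size)/1024, res.Ratio, res.CompressTime, res.DecompressTime)
	}
}
```

A single timed run like this is noisy; averaging repeated runs (for example with Go's `testing.B` benchmarks) would give more stable timings.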

TODO:

  • Compress the symdb.

liaol requested review from simonswine, korniltsev and a team as code owners July 9, 2025 06:58